This article explores how multi-collinearity can damage causal inferences in marketing mix modeling and provides methods to address it, including Bayesian priors and random budget adjustments.
This article explains how adding monotonic constraints to traditional ML models can make them more reliable for causal inference, illustrated with a real estate example.
- Researchers at the California Institute of Technology conducted a study on the control mechanics of the insect wing hinge.
- The study used a genetically encoded calcium indicator to image steering muscle activity in flies while tracking 3D wing motion.
- A Convolutional Neural Network (CNN) was trained to predict wing motion from steering muscle activity and wingbeat frequency.
- An encoder-decoder network was employed to predict the influence of individual sclerites on wing motion.
- Virtual experiments were carried out to assess the impact of modulating wing motion via steering muscle activity on aerodynamic forces.
- The study concludes that the insect wing hinge is a complex and evolutionarily significant skeletal structure.
This article explains how to use Accumulated Local Effects (ALE) plots to understand the relationship between features and the target in machine learning models, particularly when dealing with highly correlated features.
This article explains permutation feature importance (PFI), a popular method for understanding feature importance in explainable AI. The author walks through calculating PFI from scratch using Python and XGBoost, discussing the rationale behind the method and its limitations.
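The core of the PFI method described above can be sketched in a few lines: shuffle one feature column at a time and measure how much the model's error grows relative to a baseline. This is a minimal from-scratch illustration using a plain least-squares model in place of the article's XGBoost model; the toy data and variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends strongly on x0, weakly on x1, and not at all on x2.
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit a simple least-squares model as a stand-in for XGBoost.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda M: M @ coef

def mse(a, b):
    return float(np.mean((a - b) ** 2))

baseline = mse(y, predict(X))

# Permutation feature importance: shuffle one column at a time and
# record the increase in error over the unshuffled baseline.
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(mse(y, predict(Xp)) - baseline)

print(importances)
```

Running this, the importance for `x0` dwarfs that of `x1`, while `x2` sits near zero, which matches how the data were generated. It also hints at the limitation the article discusses: with correlated features, shuffling one column breaks its correlation with the others and the scores become harder to interpret.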
Discussion on the efficiency of Random Forest algorithms for PCA and Feature Importance. By Christopher Karg for Towards Data Science.
Cool question - and yes, you're right that you can use the summary command to inspect feature importances for some of the models (e.g. RandomForestClassifier). Other models may not support the same type of summary, however.
You should also check out the FieldSelector algorithm, which is really useful for this problem. Under the hood, it uses ANOVA F-tests to estimate the linear dependency between variables. Although it's univariate (it doesn't capture any interactions between variables), it can still provide a good baseline for choosing a handful of features from hundreds.
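In scikit-learn terms, both ideas in this answer look roughly like the sketch below: tree-based importances via `feature_importances_`, and univariate ANOVA F-test selection via `SelectKBest` with `f_classif`. I can't confirm the exact "FieldSelector" API mentioned above (it may belong to a different library), so `SelectKBest` here is a stand-in for the same univariate F-test idea, and the synthetic dataset is purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic classification problem with a few informative features.
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)

# Tree-based importances, as mentioned in the answer above.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(rf.feature_importances_)

# Univariate ANOVA F-test selection: scores each feature independently
# against the target, ignoring feature interactions.
selector = SelectKBest(score_func=f_classif, k=5).fit(X, y)
print(selector.get_support(indices=True))  # indices of the 5 top-scoring features
```

Because the F-test scores each feature in isolation, it is cheap to run on hundreds of columns, which is why it works well as a first-pass filter before a more expensive model-based selection.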